Leave this room with a shallow understanding of the following ACF and PACF patterns:
| Model | ACF | PACF |
|---|---|---|
| AR(p) | Tails off slowly | Cuts off after lag p |
| MA(q) | Cuts off after lag q | Tails off slowly |
| ARMA(p,q) | Tails off slowly | Tails off slowly |
We often assume so-called Gaussian white noise, whereby
\[ w_t \sim \text{N}(0,\sigma^2) \]
and the following also apply: \(\text{E}(w_t) = 0\), \(\text{Var}(w_t) = \sigma^2\), and \(\text{Cov}(w_t, w_s) = 0\) for \(t \neq s\)
par(mfrow = c(1,2), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.7, .5, 0), cex.axis = 0.8, tcl = -.3)
tt <- rnorm(100)
plot.ts(tt, ylab = expression(italic(w[t])))
acf(tt, main = "")
title(expression(w[t] %~% N(0,1)), outer = TRUE)
A time series \(\{x_t\}\) is a random walk if \(x_t = x_{t-1} + w_t\), where \(w_t\) is white noise
par(mfrow = c(1,2), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.7, .5, 0), cex.axis = 0.8, tcl = -.3)
tt <- cumsum(rnorm(100))
plot.ts(tt, ylab = expression(italic(x[t])))
acf(tt, main = "")
title(expression(list(x[t] == x[t-1] + w[t], w[t] %~% N(0,1))), outer = T)
A biased random walk (or random walk with drift) is written as
\[ x_t = x_{t-1} + u + w_t \]
where \(u\) is the bias (drift) per time step and \(w_t\) is white noise
par(mfrow = c(1,2), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.7, .5, 0), cex.axis = 0.8, tcl = -.3)
xx <- ww <- rnorm(100)
uu <- 1
for(t in 2:100) {
xx[t] <- xx[t-1] + uu + ww[t]
}
plot.ts(xx, ylab = expression(italic(x[t])))
acf(xx, main = "")
title(expression(list(x[t] == x[t-1] + 1 + w[t], w[t] %~% N(0,1))), outer = T)
First-differencing a biased random walk yields a constant mean (level) \(u\) plus white noise
\[ \begin{align} x_t &= x_{t-1} + u + w_t \\ x_t - x_{t-1} &= x_{t-1} - x_{t-1} + u + w_t \\ x_t - x_{t-1} &= u + w_t \\ \Delta x_t &= u + w_t \end{align} \]
par(mfrow = c(1,2), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.7, .5, 0), cex.axis = 0.8, tcl = -.3)
plot.ts(diff(xx), ylab = expression(paste(Delta,italic(x[t]))))
acf(diff(xx), main = "")
title(expression(list(paste(Delta, x[t]) == 1 + w[t],w[t] %~% N(0,1))), outer = T, line = .7)
Autoregressive models treat the current state of nature as a function of its past state(s)
An autoregressive model of order p, or AR(p), is defined as
\[ x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \dots + \phi_p x_{t-p} + w_t \]
where we assume \(w_t\) is white noise
set.seed(123)
### the 4 AR coefficients
ARp <- c(0.7, 0.2, -0.1, -0.3)
### empty list for storing models
AR_mods <- vector("list", 4L)
par(mfrow = c(2,2), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.7, .5, 0), cex.axis = 0.8, tcl = -.3)
### loop over orders of p
for(p in 1:4) {
## assume SD=1, so not specified
AR_mods[[p]] <- arima.sim(n=50, list(ar=ARp[1:p]))
plot.ts(AR_mods[[p]], las = 1,
ylab = expression(italic(x[t])))
mtext(side = 3, paste0("AR(",p,")"),
line = 0.3, adj = 0, cex = 0.8)
}
In the context of time series models, stationarity means ‘not changing in time’: a time series is ‘stationary’ if its statistical properties are constant over time.
Specifically, stationary time series have the following properties: a constant mean, a constant variance, and an autocovariance that depends only on the lag between observations, not on time itself
We seek a means for identifying whether our AR(p) models are also stationary
We can write out an AR(p) model using the backshift operator
\[ x_t = \phi_1 x_{t-1} + \phi_2 x_{t-2} + \dots + \phi_p x_{t-p} + w_t \\ \Downarrow \\ \begin{align} x_t - \phi_1 x_{t-1} - \phi_2 x_{t-2} - \dots - \phi_p x_{t-p} &= w_t \\ (1 - \phi_1 \mathbf{B} - \phi_2 \mathbf{B}^2 - \dots - \phi_p \mathbf{B}^p) x_t &= w_t \\ \phi_p (\mathbf{B}) x_t &= w_t \\ \end{align} \]
If we treat \(\mathbf{B}\) as a number (or numbers), we can write out the characteristic equation as
\[ \phi_p (\mathbf{B}) x_t = w_t \\ \Downarrow \\ \phi_p (\mathbf{B}) = 0 \]
To be stationary, all roots of the characteristic equation must exceed 1 in absolute value
\[ \begin{align} x_t &= 0.5 x_{t-1} + w_t \\ x_t - 0.5 x_{t-1} &= w_t \\ (1 - 0.5 \mathbf{B})x_t &= w_t \\ \Downarrow \\ 1 - 0.5 \mathbf{B} &= 0 \\ -0.5 \mathbf{B} &= -1 \\ \mathbf{B} &= 2 \\ \end{align} \]
This model is stationary because \(\mathbf{B} > 1\)
\[ \begin{align} x_t &= -0.2 x_{t-1} + 0.4 x_{t-2} + w_t \\ x_t + 0.2 x_{t-1} - 0.4 x_{t-2} &= w_t \\ (1 + 0.2 \mathbf{B} - 0.4 \mathbf{B}^2)x_t &= w_t \\ \Downarrow \\ 1 + 0.2 \mathbf{B} - 0.4 \mathbf{B}^2 &= 0 \\ \Downarrow \\ \mathbf{B} \approx -1.35 ~ \text{and}& ~ \mathbf{B} \approx 1.85 \end{align} \]
This model is also stationary because both roots exceed 1 in absolute value: \(1.85 > 1\) and \(\lvert -1.35 \rvert = 1.35 > 1\)
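Rather than solving characteristic equations by hand, we can use base R's `polyroot()`, which returns the (possibly complex) roots of a polynomial given its coefficients in increasing order of powers; `Mod()` then gives their absolute values. A quick check of the AR(2) example above:

```r
## coefficients of 1 + 0.2 B - 0.4 B^2, in increasing order of powers
Bs <- polyroot(c(1, 0.2, -0.4))
Mod(Bs) ## absolute values of the roots: ~1.35 and ~1.85
```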
Consider our random walk model
\[ \begin{align} x_t &= x_{t-1} + w_t \\ x_t - x_{t-1} &= w_t \\ (1 - 1 \mathbf{B})x_t &= w_t \\ \Downarrow \\ 1 - 1 \mathbf{B} &= 0 \\ -1 \mathbf{B} &= -1 \\ \mathbf{B} &= 1 \\ \end{align} \]
Random walks are not stationary because \(\mathbf{B} = 1 \ngtr 1\)
set.seed(123)
### list description for AR(1) model with positive coef
AR_pos <- list(order=c(1,0,0), ar=0.7, sd=0.1)
### list description for AR(1) model with negative coef
AR_neg <- list(order=c(1,0,0), ar=-0.7, sd=0.1)
### simulate AR(1)
AR1_pos <- arima.sim(n=500, model=AR_pos)
AR1_neg <- arima.sim(n=500, model=AR_neg)
### get y-limits for common plots
ylm1 <- c(min(AR1_pos[1:50],AR1_neg[1:50]), max(AR1_pos[1:50],AR1_neg[1:50]))
### set the margins & text size
par(mfrow = c(1,2), mai = c(.6,0.6,0.1,0.2), oma = c(0,0,1.5,0),
mgp = c(1.7,.6,0))
### plot the ts
plot.ts(AR1_pos[1:50], ylim=ylm1, las = 1,
ylab=expression(italic(x)[italic(t)]))
mtext(side = 3, expression(paste(phi[1]," = 0.7")),
line = 0.4, adj = 0, cex = 0.8)
plot.ts(AR1_neg[1:50], ylim=ylm1, las = 1,
ylab=expression(italic(x)[italic(t)]))
mtext(side = 3, expression(paste(phi[1]," = -0.7")),
line = 0.4, adj = 0, cex = 0.8)
set.seed(123)
### list description for AR(1) model with large coef
AR_bg <- list(order=c(1,0,0), ar=0.9, sd=0.1)
### list description for AR(1) model with small coef
AR_sm <- list(order=c(1,0,0), ar=0.1, sd=0.1)
### simulate AR(1)
AR1_bg <- arima.sim(n=500, model=AR_bg)
AR1_sm <- arima.sim(n=500, model=AR_sm)
### get y-limits for common plots
ylm2 <- c(min(AR1_bg[1:50],AR1_sm[1:50]), max(AR1_bg[1:50],AR1_sm[1:50]))
### set the margins & text size
par(mfrow = c(1,2), mai = c(.6,0.6,0.1,0.2), oma = c(0,0,1.5,0),
mgp = c(1.7,.6,0))
### plot the ts
plot.ts(AR1_bg[1:50], ylim = ylm2, las = 1,
ylab = expression(italic(x)[italic(t)]),
main = "")
mtext(side = 3, expression(paste(phi[1]," = 0.9")),
line = 0.4, adj = 0, cex = 0.8)
plot.ts(AR1_sm[1:50], ylim = ylm2, las = 1,
ylab = expression(italic(x)[italic(t)]),
main = "")
mtext(side = 3, expression(paste(phi[1]," = 0.1")),
line = 0.4, adj = 0, cex = 0.8)
Recall that the autocorrelation function (\(\rho_k\)) measures the correlation between \(\{x_t\}\) and a shifted version of itself \(\{x_{t+k}\}\)
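As a sketch of what `acf()` reports (assuming the default `demean = TRUE` and the usual \(1/n\) scaling of the autocovariances), we can compute \(\rho_k\) by hand and compare:

```r
## lag-k sample autocorrelation: autocovariance at lag k over the variance
rho_k <- function(x, k) {
  n <- length(x)
  xbar <- mean(x)
  sum((x[1:(n - k)] - xbar) * (x[(k + 1):n] - xbar)) / sum((x - xbar)^2)
}
set.seed(42)
x <- rnorm(200)
## acf()$acf[1] is lag 0, so lag 3 is the 4th element
c(by_hand = rho_k(x, 3), from_acf = drop(acf(x, plot = FALSE)$acf)[4])
```

The two values should agree to numerical precision.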
### set the margins & text size
par(mfrow=c(2,2), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.7, .5, 0), cex.axis = 0.8, tcl = -.3)
### plot the ts
plot.ts(AR1_pos[1:50], ylim=ylm1, las = 1,
ylab=expression(italic(x)[italic(t)]),
main = "")
mtext(side = 3, expression(paste(phi[1]," = 0.7")),
line = 0.2, adj = 0, cex = 0.8)
acf(AR1_pos, lag.max = 20, las = 1)
plot.ts(AR1_neg[1:50], ylim=ylm1, las = 1,
ylab=expression(italic(x)[italic(t)]),
main = "")
mtext(side = 3, expression(paste(phi[1]," = -0.7")),
line = 0.2, adj = 0, cex = 0.8)
acf(AR1_neg, lag.max = 20, las = 1)
### set the margins & text size
par(mfrow=c(2,2), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.7, .5, 0), cex.axis = 0.8, tcl = -.3)
### plot the ts
plot.ts(AR1_bg[1:50], ylim = ylm2, las = 1,
ylab = expression(italic(x)[italic(t)]),
main = "")
mtext(side = 3, expression(paste(phi[1]," = 0.9")),
line = 0.2, adj = 0, cex = 0.8)
acf(AR1_bg, lag.max = 20, las = 1)
plot.ts(AR1_sm[1:50], ylim = ylm2, las = 1,
ylab = expression(italic(x)[italic(t)]),
main = "")
mtext(side = 3, expression(paste(phi[1]," = 0.1")),
line = 0.2, adj = 0, cex = 0.8)
acf(AR1_sm, lag.max = 20, las = 1)
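The geometric decay visible in these ACF plots matches theory: for a stationary AR(1), \(\rho(k) = \phi_1^k\). A quick sketch of that check with `stats::ARMAacf()`:

```r
## theoretical ACF of an AR(1) with phi = 0.7 decays as phi^k
rbind(theory = ARMAacf(ar = 0.7, lag.max = 4),
      phi_k  = 0.7^(0:4))
```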
Recall that the partial autocorrelation function (\(\phi_k\)) measures the correlation between \(\{x_t\}\) and a shifted version of itself \(\{x_{t+k}\}\), with the linear dependence on the intervening values \(\{x_{t+1},x_{t+2},\dots,x_{t+k-1}\}\) removed
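One way to make this concrete (a sketch, relying on the standard equivalence between the lag-\(k\) partial autocorrelation and the last coefficient of an AR(\(k\)) fit): Yule-Walker fits via `ar.yw()` should reproduce the values that `pacf()` reports:

```r
set.seed(42)
x <- arima.sim(n = 1000, list(ar = c(0.6, 0.3)))
## last Yule-Walker coefficient of an AR(k) fit, for k = 1, ..., 5
phi_kk <- sapply(1:5, function(k) ar.yw(x, order.max = k, aic = FALSE)$ar[k])
cbind(ar_yw = phi_kk,
      pacf  = drop(pacf(x, lag.max = 5, plot = FALSE)$acf))
```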
### 2 sets of AR(3) coefficients
ARp3 <- list(c(0.7, 0.2, -0.1), c(-0.7, 0.2, 0.1))
expr <- list(expression(paste("AR(3) with ", phi[1], " = 0.7, ",
phi[2], " = 0.2, ", phi[3], " = -0.1")),
expression(paste("AR(3) with ", phi[1], " = -0.7, ",
phi[2], " = 0.2, ", phi[3], " = 0.1")))
### empty list for storing models
AR3_mods <- vector("list", 2L)
par(mfrow = c(2,3), mai = c(0.5,0.4,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.8, .7, 0), cex.axis = 0.8, tcl = -.3)
### loop over orders of p
for(p in 1:2) {
## assume SD=1, so not specified
AR3_mods[[p]] <- arima.sim(n=5000, list(ar=ARp3[[p]]))
plot.ts(AR3_mods[[p]][1:50], las = 1,
ylab = expression(italic(x[t])))
acf(AR3_mods[[p]], lag.max = 20,
las = 1, main = "")
mtext(side = 3, expr[[p]],
line = 0.2, adj = 0.5, cex = .8)
pacf(AR3_mods[[p]], lag.max = 20,
las = 1, main = "")
}
### empty list for storing models
pacf_mods <- vector("list", 4L)
par(mfrow = c(2,2), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.7, .5, 0), cex.axis = 0.8, tcl = -.3)
### loop over orders of p
for(p in 1:4) {
pacf_mods[[p]] <- arima.sim(n=5000, list(ar=ARp[1:p]))
pacf(pacf_mods[[p]], lag.max = 15,
las = 1, main = "")
mtext(side = 3, paste0("AR(",p,")"),
line = 0.2, adj = 0)
}
Do you see the link between the order p and lag k?
| Model | ACF | PACF |
|---|---|---|
| AR(p) | Tails off slowly | Cuts off after lag p |
Moving average models are most commonly used for forecasting a future state
A moving average model of order q, or MA(q), is defined as
\[ x_t = w_t + \theta_1 w_{t-1} + \theta_2 w_{t-2} + \dots + \theta_q w_{t-q} \]
where \(w_t\) is white noise
Each of the \(x_t\) is a sum of the most recent error terms
Thus, all MA processes are stationary because they are finite sums of stationary WN processes
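For the MA(1) case this is easy to verify directly: the mean and autocovariances are finite and do not depend on \(t\)

\[ \begin{align} \text{E}(x_t) &= 0 \\ \gamma(0) &= \text{Var}(x_t) = (1 + \theta_1^2) \sigma^2 \\ \gamma(1) &= \text{Cov}(x_t, x_{t-1}) = \theta_1 \sigma^2 \\ \gamma(k) &= 0 ~ \text{for} ~ k > 1 \end{align} \]

so \(\rho(1) = \theta_1 / (1 + \theta_1^2)\) and \(\rho(k) = 0\) for \(k > 1\), which is also why the ACF of an MA(q) cuts off after lag q.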
### the 4 MA coefficients
MAq <- c(0.7, 0.2, -0.1, -0.3)
### empty list for storing models
MA_mods <- vector("list", 4L)
### loop over orders of q
for(q in 1:4) {
## assume SD=1, so not specified
MA_mods[[q]] <- arima.sim(n=500, list(ma=MAq[1:q]))
}
par(mfrow = c(2,2), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.7, .5, 0), cex.axis = 0.8, tcl = -.3)
### loop over orders of q
for(q in 1:4) {
## assume SD=1, so not specified
plot.ts(MA_mods[[q]][1:50], las = 1,
ylab = expression(italic(x[t])))
mtext(side = 3, paste0("MA(",q,")"),
line = 0.2, adj = 0, cex = 0.8)
}
MA1 <- arima.sim(n=50, list(ma=c(0.7)))
MA2 <- arima.sim(n=50, list(ma=c(-1, 0.7)))
par(mfrow = c(1,2), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.7, .5, 0), cex.axis = 0.8, tcl = -.3)
### plot the two MA series
plot.ts(MA1, las = 1,
ylab = expression(italic(x[t])))
mtext(side = 3,
expression(MA(1):~italic(x[t])==~italic(w[t])+0.7~italic(w[t-1])),
line = 0.2, adj = 0, cex = 0.8)
plot.ts(MA2, las = 1,
ylab = expression(italic(x[t])))
mtext(side = 3,
expression(MA(2):~italic(x[t])==~italic(w[t])-~italic(w[t-1])+0.7~italic(w[t-2])),
line = 0.2, adj = 0, cex = 0.8)
It is possible to write an AR(p) model as an MA(\(\infty\)) model
For example, consider an AR(1) model
\[ \begin{align} x_t &= \phi x_{t-1} + w_t \\ x_t &= \phi (\phi x_{t-2} + w_{t-1}) + w_t \\ x_t &= \phi^2 x_{t-2} + \phi w_{t-1} + w_t \\ x_t &= \phi^3 x_{t-3} + \phi^2 w_{t-2} + \phi w_{t-1} + w_t \\ & \Downarrow \\ x_t &= w_t + \phi w_{t-1}+ \phi^2 w_{t-2} + \dots + \phi^k w_{t-k} + \phi^{k+1} x_{t-k-1} \end{align} \]
If our AR(1) model is stationary, then
\[ \lvert \phi \rvert < 1 ~ \Rightarrow ~ \lim_{k \to \infty} \phi^{k+1} = 0 \]
so
\[ \begin{align} x_t &= w_t + \phi w_{t-1}+ \phi^2 w_{t-2} + \dots + \phi^k w_{t-k} + \phi^{k+1} x_{t-k-1} \\ & \Downarrow \\ x_t &= w_t + \phi w_{t-1}+ \phi^2 w_{t-2} + \phi^3 w_{t-3} + \dots \end{align} \]
An MA(q) process is invertible if it can be written as a stationary autoregressive process of infinite order without an error term
For example, consider an MA(1) model
\[ \begin{align} x_t &= w_t + \theta w_{t-1} \\ & \Downarrow \\ w_t &= x_t - \theta w_{t-1} \\ w_t &= x_t - \theta (x_{t-1} - \theta w_{t-2}) \\ w_t &= x_t - \theta x_{t-1} + \theta^2 w_{t-2} \\ & ~~\vdots \\ w_t &= x_t - \theta x_{t-1} + \dots + (-\theta)^k x_{t-k} + (-\theta)^{k+1} w_{t-k-1} \\ \end{align} \]
If we constrain \(\lvert \theta \rvert < 1\), then
\[ \lim_{k \to \infty} (-\theta)^{k+1} w_{t-k-1} = 0 \]
and
\[ \begin{align} w_t &= x_t - \theta x_{t-1} + \dots + (-\theta)^k x_{t-k} + (-\theta)^{k+1} w_{t-k-1} \\ & \Downarrow \\ w_t &= x_t - \theta x_{t-1} + \dots + (-\theta)^k x_{t-k} \\ w_t &= x_t + \sum_{k=1}^\infty(-\theta)^k x_{t-k} \end{align} \]
Q: Why do we care if an MA(q) model is invertible?
A: It helps us identify the model’s parameters uniquely, because more than one MA model can produce the same ACF
But we select the first model because it is invertible (its coefficient is less than 1 in absolute value).
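The identification issue is easy to see numerically with `stats::ARMAacf()`: an MA(1) with coefficient \(\theta\) and one with coefficient \(1/\theta\) have identical theoretical ACFs, since \(\rho(1) = \theta / (1 + \theta^2)\) is unchanged by swapping \(\theta\) for \(1/\theta\):

```r
## same ACF, but only the first model is invertible (|theta| < 1)
ARMAacf(ma = 0.7, lag.max = 3)
ARMAacf(ma = 1 / 0.7, lag.max = 3)
```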
### set the MA coefficients
set.seed(123)
MAp3 <- list(c(0.7), c(-0.7, 0.5, 0.2))
expr <- list(expression(paste("MA(1) with ", theta[1], " = 0.7")),
expression(paste("MA(3) with ", theta[1], " = -0.7, ",
theta[2], " = 0.5, ", theta[3], " = 0.2")))
### empty list for storing models
MA3_mods <- vector("list", 2L)
par(mfrow = c(2,3), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.7, .5, 0), cex.axis = 0.8, tcl = -.3)
### loop over the 2 MA models
for(p in 1:2) {
## assume SD=1, so not specified
MA3_mods[[p]] <- arima.sim(n=500, model = list(ma=MAp3[[p]]))
plot.ts(MA3_mods[[p]][1:50], las = 1,
ylab = expression(italic(x[t])))
acf(MA3_mods[[p]], lag.max = 10,
las = 1, main = "")
mtext(side = 3, expr[[p]],
line = 0.2, adj = 0.5, cex = .8)
pacf(MA3_mods[[p]], lag.max = 10,
las = 1, main = "")
}
### the 4 MA coefficients
set.seed(123)
MAq <- c(0.7, 0.5, -0.2, -0.3)
### empty list for storing models
acf_mods <- vector("list", 4L)
par(mfrow = c(2,2), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.7, .5, 0), cex.axis = 0.8, tcl = -.3)
### loop over orders of q
for(q in 1:4) {
acf_mods[[q]] <- arima.sim(n=5000, list(ma=MAq[1:q]))
acf(acf_mods[[q]], lag.max = 15,
las = 1, main = "")
mtext(side = 3, paste0("MA(",q,")"),
line = 0.2, adj = 0, cex = .8)
}
Do you see the link between the order q and lag k?
| Model | ACF | PACF |
|---|---|---|
| AR(p) | Tails off slowly | Cuts off after lag p |
| MA(q) | Cuts off after lag q | Tails off slowly |
An autoregressive moving average, or ARMA(p,q), model is written as
\[ x_t = \phi_1 x_{t-1} + \dots + \phi_p x_{t-p} + w_t + \theta_1 w_{t-1} + \dots + \theta_q w_{t-q} \]
We can write an ARMA(p,q) model using the backshift operator
\[ \phi_p (\mathbf{B}) x_t= \theta_q (\mathbf{B}) w_t \]
ARMA models are stationary if all roots of \(\phi_p (\mathbf{B}) = 0\) exceed 1 in absolute value
ARMA models are invertible if all roots of \(\theta_q (\mathbf{B}) = 0\) exceed 1 in absolute value
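As a quick numerical check, base R's `polyroot()` gives the roots of each polynomial; for example, an ARMA(1,1) with \(\phi_1 = 0.5\) and \(\theta_1 = 0.8\):

```r
## AR polynomial 1 - 0.5 B: root at B = 2, so stationary
Mod(polyroot(c(1, -0.5)))
## MA polynomial 1 + 0.8 B: root at B = -1.25, so invertible
Mod(polyroot(c(1, 0.8)))
```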
set.seed(123)
arma_mods <- vector("list", 4L)
### ARMA(3,1): phi[1] = 0.7, phi[2] = 0.2, phi[3] = -0.1, theta[1]= 0.5
arma_mods[[1]] <- arima.sim(list(ar=c(0.7, 0.2, -0.1), ma=c(0.5)), n=5000)
### ARMA(2,2): phi[1] = -0.7, phi[2] = 0.2, theta[1] = 0.7, theta[2]= 0.2
arma_mods[[2]] <- arima.sim(list(ar=c(-0.7, 0.2), ma=c(0.7, 0.2)), n=5000)
### ARMA(1,3): phi[1] = 0.7, theta[1] = 0.7, theta[2]= 0.2, theta[3] = 0.5
arma_mods[[3]] <- arima.sim(list(ar=c(0.7), ma=c(0.7, 0.2, 0.5)), n=5000)
### ARMA(2,2): phi[1] = 0.7, phi[2] = 0.2, theta[1] = 0.7, theta[2]= 0.2
arma_mods[[4]] <- arima.sim(list(ar=c(0.7, 0.2), ma=c(0.7, 0.2)), n=5000)
titles <- list(
expression("ARMA(3,1): "*phi[1]*" = 0.7, "*phi[2]*" = 0.2, "*phi[3]*" = -0.1, "*theta[1]*" = 0.5"),
expression("ARMA(2,2): "*phi[1]*" = -0.7, "*phi[2]*" = 0.2, "*theta[1]*" = 0.7, "*theta[2]*" = 0.2"),
expression("ARMA(1,3): "*phi[1]*" = 0.7, "*theta[1]*" = 0.7, "*theta[2]*" = 0.2, "*theta[3]*" = 0.5"),
expression("ARMA(2,2): "*phi[1]*" = 0.7, "*phi[2]*" = 0.2, "*theta[1]*" = 0.7, "*theta[2]*" = 0.2")
)
par(mfrow = c(2,2), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.7, .5, 0), cex.axis = 0.8, tcl = -.3)
### loop over the 4 ARMA models
for(i in 1:4) {
plot.ts(arma_mods[[i]][1:50], las = 1,
main = "", ylab = expression(italic(x[t])))
mtext(side = 3, titles[[i]],
line = 0.2, adj = 0, cex = 0.7)
}
par(mfrow = c(2,2), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.7, .5, 0), cex.axis = 0.8, tcl = -.3)
### loop over the 4 ARMA models
for(i in 1:4) {
acf(arma_mods[[i]][1:1000], las = 1,
main = " ")
mtext(side = 3, titles[[i]],
line = 0.2, adj = 0, cex = 0.8)
}
par(mfrow = c(2,2), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.7, .5, 0), cex.axis = 0.8, tcl = -.3)
### loop over the 4 ARMA models
for(i in 1:4) {
pacf(arma_mods[[i]][1:1000], las = 1,
main = "")
mtext(side = 3, titles[[i]],
line = 0.2, adj = 0, cex = 0.8)
}
| Model | ACF | PACF |
|---|---|---|
| AR(p) | Tails off slowly | Cuts off after lag p |
| MA(q) | Cuts off after lag q | Tails off slowly |
| ARMA(p,q) | Tails off slowly | Tails off slowly |
If the data do not appear stationary, differencing can help
This leads to the class of autoregressive integrated moving average (ARIMA) models
ARIMA models are indexed with orders (p,d,q) where d indicates the order of differencing
For \(d > 0\), \(\{x_t\}\) is an ARIMA(p,d,q) process if \((1-\mathbf{B})^d x_t\) is an ARMA(p,q) process
For example, if \(\{x_t\}\) is an ARIMA(1,1,0) process then \(\Delta \{x_t\}\) is an ARMA(1,0) = AR(1) process
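A minimal sketch of that equivalence using `stats::arima()` (the simulated series and variable names here are illustrative): fitting ARIMA(1,1,0) to an integrated series should give essentially the same AR coefficient as fitting AR(1) to its first difference:

```r
set.seed(123)
## integrate a simulated AR(1) once to get an ARIMA(1,1,0) process
zz <- ts(cumsum(arima.sim(model = list(ar = 0.5), n = 500)))
fit_arima <- arima(zz, order = c(1, 1, 0))
fit_ar1 <- arima(diff(zz), order = c(1, 0, 0), include.mean = FALSE)
c(coef(fit_arima)["ar1"], coef(fit_ar1)["ar1"])
```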
set.seed(123)
xx <- arima.sim(model=list(ar=0.5, sd=0.1), n=100)
yy <- cumsum(xx)
par(mfrow = c(2,3), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.6, .5, 0), cex.axis = 0.8, tcl = -.3)
plot.ts(yy, las = 1,
ylab=expression(italic(x[t])))
mtext(side = 3, "ARIMA(1,1,0)", line = 0.2, adj = 0, cex = 0.8)
acf(yy, main = "")
pacf(yy, main = "")
plot.ts(diff(yy), las = 1,
ylab=expression(paste(Delta, italic(x[t]))))
mtext(side = 3, "ARMA(1,0)", line = 0.2, adj = 0, cex = 0.8)
acf(diff(yy), main = "")
pacf(diff(yy), main = "")
Fortunately, the Box-Jenkins method is largely automated by the forecast package’s auto.arima() function.
Let’s look at some examples of auto.arima in action using simulated data.
set.seed(1789)
nsims=50
# ARIMA(1,0,1)
xx <- arima.sim(model=list(ar = 0.5, ma = .3, sd=0.1), n=nsims)
# ARIMA(1,1,1)
yy <- ts(cumsum(xx))
par(mfrow = c(1,2), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.6, .5, 0), cex.axis = 0.8, tcl = -.3)
plot.ts(xx, las = 1,
ylab=expression(italic(x[t])))
plot.ts(yy, las = 1,
ylab=expression(italic(y[t])))
# estimate when knowing the order
stats::arima(xx, order = c(1,0,1), include.mean = F)
Call:
stats::arima(x = xx, order = c(1, 0, 1), include.mean = F)
Coefficients:
ar1 ma1
0.3734 0.5488
s.e. 0.1746 0.1697
sigma^2 estimated as 0.5079: log likelihood = -54.45, aic = 114.9
stats::arima(yy, order = c(1,1,1), include.mean = F)
Call:
stats::arima(x = yy, order = c(1, 1, 1), include.mean = F)
Coefficients:
ar1 ma1
0.3754 0.5486
s.e. 0.1760 0.1709
sigma^2 estimated as 0.5171: log likelihood = -53.81, aic = 113.63
# estimate with forecast::auto.arima
## finds ARMA(1,1) with similar parameter estimates
forecast::auto.arima(xx)
Series: xx
ARIMA(1,0,1) with zero mean
Coefficients:
ar1 ma1
0.3734 0.5488
s.e. 0.1746 0.1697
sigma^2 estimated as 0.529: log likelihood=-54.45
AIC=114.9 AICc=115.42 BIC=120.64
forecast::auto.arima(yy)
Series: yy
ARIMA(1,1,1)
Coefficients:
ar1 ma1
0.3754 0.5486
s.e. 0.1760 0.1709
sigma^2 estimated as 0.5391: log likelihood=-53.81
AIC=113.63 AICc=114.16 BIC=119.3
But auto.arima isn’t a cure-all. Let’s try a different simulation with a more complicated model. We’ll see that auto.arima, like any automated procedure, should be used with caution.
set.seed(1789)
N=50
# ARIMA(2,0,2)
xx <- arima.sim(model=list(ar = c(0.5, 0.2), ma = c(.3, -.2), sd=0.2), n=N)
# ARIMA(2,1,2)
yy <- ts(cumsum(xx))
par(mfrow = c(1,2), mai = c(0.6,0.5,0.3,0.1), oma = c(0,0,1.5,0),
mgp = c(1.6, .5, 0), cex.axis = 0.8, tcl = -.3)
plot.ts(xx, las = 1,
ylab=expression(italic(x[t])))
plot.ts(yy, las = 1,
ylab=expression(italic(y[t])))
# estimate when knowing the order
## pretty good estimates
stats::arima(xx, order = c(2,0,2), include.mean = F)
Call:
stats::arima(x = xx, order = c(2, 0, 2), include.mean = F)
Coefficients:
ar1 ar2 ma1 ma2
0.4367 0.2952 0.3614 -0.2329
s.e. 0.4308 0.3051 0.4169 0.1840
sigma^2 estimated as 0.7279: log likelihood = -63.47, aic = 136.95
stats::arima(yy, order = c(2,1,2), include.mean = F)
Call:
stats::arima(x = yy, order = c(2, 1, 2), include.mean = F)
Coefficients:
ar1 ar2 ma1 ma2
0.4432 0.2920 0.3580 -0.2314
s.e. 0.4322 0.3064 0.4172 0.1845
sigma^2 estimated as 0.7421: log likelihood = -62.69, aic = 135.38
# estimate with forecast::auto.arima
forecast::auto.arima(xx) # FAILS. finds ARIMA(0,1,0)
Series: xx
ARIMA(0,1,0)
sigma^2 estimated as 0.903: log likelihood=-67.03
AIC=136.06 AICc=136.14 BIC=137.95
forecast::auto.arima(yy) # FAILS. finds ARIMA(0,2,0)
Series: yy
ARIMA(0,2,0)
sigma^2 estimated as 0.9138: log likelihood=-65.94
AIC=133.89 AICc=133.98 BIC=135.76
Of course, this is simulated data. To be fair to auto.arima, I should run many distinct simulations to see how often auto.arima correctly identifies the model order and how well it estimates the model coefficients.
However, keep in mind, when it comes to working with real time series data one only gets to observe a single realized state of the world. Tread carefully.
Figure 1: Source: https://otexts.com/fpp2/seasonal-arima.html
A seasonal ARIMA model adds seasonal terms and is written as ARIMA(p,d,q)(P,D,Q)[m], where \(m\) is the number of observations per year.
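For example, following the fpp2 notation, a quarterly ARIMA(1,1,1)(1,1,1)[4] model combines the non-seasonal and seasonal terms multiplicatively

\[ (1 - \phi_1 \mathbf{B}) (1 - \Phi_1 \mathbf{B}^4) (1 - \mathbf{B}) (1 - \mathbf{B}^4) x_t = (1 + \theta_1 \mathbf{B}) (1 + \Theta_1 \mathbf{B}^4) w_t \]

where \(\Phi_1\) and \(\Theta_1\) are the seasonal AR and MA coefficients.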
library(forecast)
library(fpp2)
fpp2::euretail
Qtr1 Qtr2 Qtr3 Qtr4
1996 89.13 89.52 89.88 90.12
1997 89.19 89.78 90.03 90.38
1998 90.27 90.77 91.85 92.51
1999 92.21 92.52 93.62 94.15
2000 94.69 95.34 96.04 96.30
2001 94.83 95.14 95.86 95.83
2002 95.73 96.36 96.89 97.01
2003 96.66 97.76 97.83 97.76
2004 98.17 98.55 99.31 99.44
2005 99.43 99.84 100.32 100.40
2006 99.88 100.19 100.75 101.01
2007 100.84 101.34 101.94 102.10
2008 101.56 101.48 101.13 100.34
2009 98.93 98.31 97.67 97.44
2010 96.53 96.56 96.51 96.70
2011 95.88 95.84 95.79 95.94
Recall that seasonality violates stationarity. Let’s first try to remove the seasonality by taking the first seasonal difference (\(m=4\) so the seasonal diff is on the 4th lag).
euretail %>%
diff(lag=4) %>%
forecast::ggtsdisplay()
Still non-stationary. Let’s also take the first difference
euretail %>%
diff(lag=4) %>%
diff(lag=1) %>%
forecast::ggtsdisplay()
auto.arima with seasonal data
fit <- euretail %>%
forecast::auto.arima() # assumes seasonal by default and runs tests
fit # finds an ARIMA(0,1,3)(0,1,1)[4] fits best
Series: .
ARIMA(0,1,3)(0,1,1)[4]
Coefficients:
ma1 ma2 ma3 sma1
0.2630 0.3694 0.4200 -0.6636
s.e. 0.1237 0.1255 0.1294 0.1545
sigma^2 estimated as 0.156: log likelihood=-28.63
AIC=67.26 AICc=68.39 BIC=77.65
forecast::checkresiduals(fit) # pretty baller fit
Ljung-Box test
data: Residuals from ARIMA(0,1,3)(0,1,1)[4]
Q* = 0.51128, df = 4, p-value = 0.9724
Model df: 4. Total lags used: 8
# plot forecast
fit %>%
forecast::forecast(h=12) %>%
forecast::autoplot()
library(fpp2)
forecast::autoplot(uschange[,1:2], facets=TRUE) +
ggplot2::xlab("Year") +
ggplot2::ylab("") +
ggplot2::ggtitle("Quarterly changes in US consumption and personal income")
trn <- head(uschange, -20) # training set: drop the last 20 quarters
tst <- tail(uschange, 20) # test set: the last 20 quarters
fit <- forecast::auto.arima(
trn[,"Consumption"],
xreg = trn[,"Income"]) # pass a vector of regressors
fit # model y[t] = b0 + b1*x[t] + e[t], e[t] ~ ARIMA(1,0,2)
Series: trn[, "Consumption"]
Regression with ARIMA(1,0,2) errors
Coefficients:
ar1 ma1 ma2 intercept xreg
0.6773 -0.5738 0.1978 0.5717 0.2542
s.e. 0.1336 0.1474 0.0781 0.0941 0.0536
sigma^2 estimated as 0.3448: log likelihood=-145.65
AIC=303.29 AICc=303.82 BIC=322
# pass the held out income data from the test set
fit %>%
forecast::forecast(h = nrow(tst), xreg = tst[,"Income"]) %>%
forecast::autoplot() +
forecast::autolayer(tst[,"Consumption"], series = 'actuals')
The problem is that forecasting beyond the observed data requires forecasts of the regressors in xreg.
# use full data set
fit_consump <- forecast::auto.arima(
uschange[,"Consumption"],
xreg = uschange[,"Income"]) # pass a vector of regressors
fit_consump # model y[t] = b0 + b1*x[t] + e[t], e[t] ~ ARIMA(1,0,2)
Series: uschange[, "Consumption"]
Regression with ARIMA(1,0,2) errors
Coefficients:
ar1 ma1 ma2 intercept xreg
0.6922 -0.5758 0.1984 0.5990 0.2028
s.e. 0.1159 0.1301 0.0756 0.0884 0.0461
sigma^2 estimated as 0.3219: log likelihood=-156.95
AIC=325.91 AICc=326.37 BIC=345.29
# but to forecast consumption beyond what's observed, we need a forecast of income!
H = 24 # forecast horizon
fit_income <- forecast::auto.arima(uschange[,"Income"])
fc_income <- fit_income %>% forecast::forecast(h = H)
# extract expected outcome of univariate forecast for income
fc_mean <- fc_income$mean
# boring plot of expected outcome
fit_consump %>%
forecast::forecast(h = length(fc_mean), xreg = fc_mean) %>%
forecast::autoplot()
Instead of forecasting the expected outcome with a confidence interval, let’s simulate 30 income forecasts, estimate the expected consumption forecasts, and layer them on top of each other.
# pad with leading zeros
add_zero <- function(v, spots = 2) {
sapply(v, function(x) {
len <- nchar(as.character(x))
if(len < spots) {
h <- paste(rep("0", spots - len), collapse = '')
sprintf('%s%d', h, x)
} else {
paste(x)
}
})
}
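As an aside, base `sprintf()` can do this padding directly; the helper above is equivalent to a zero-padded integer format:

```r
## "%02d" pads integers with leading zeros to a width of 2
sprintf("%02d", c(1, 10, 30)) ## "01" "10" "30"
```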
## income is also not very interesting
forecast::autoplot(fc_income) # ARIMA(0,0,0)
## instead, lets simulate possible paths.
N = 30 # simulations
sim_mat <- lapply(1:N, function(x) {
forecast:::simulate.Arima(fit_income, nsim = H)
}) %>%
Reduce('ts.intersect', .) %>%
'colnames<-'(paste0("sim", add_zero(1:N)))
fc_mat <- lapply(seq_len(ncol(sim_mat)), function(i) {
x <- sim_mat[, i] # iterate over columns; `lapply()` over the mts itself loops over single elements
fit_consump %>%
forecast::forecast(h = length(x), xreg = x) %>%
'[['('mean')
}) %>%
Reduce('ts.intersect', .) %>%
'colnames<-'(paste0("fc", add_zero(1:N)))
# build baseline
fc_gg <- uschange[,"Consumption"] %>%
forecast::autoplot()
# layer on forecast using forecasted income
for(i in 1:ncol(fc_mat)) {
fc_gg <- fc_gg +
forecast::autolayer(fc_mat[,i], series = colnames(fc_mat)[i], alpha = .4)
}
# plot
fc_gg +
ggplot2::theme(legend.position = 'none')
Take a model that regresses on a trend variable \(t\)
\[ y_t = \beta_0 + \beta_1 t + \eta_t \]
If \(\eta_t\) is an ARMA process (\(d = 0\)), the model has a deterministic trend; if \(\eta_t\) is an ARIMA process with \(d > 0\), it has a stochastic trend. The stochastic trend occurs because when \(d \geq 1\) the error term \(\eta_t\) carries information from \(y_{t-1}\) forward, not just residual error from the past, \(\eta_{t-1}\).
For example, if \(\eta_t\) is ARIMA(1,1,0), first-differencing the model gives
\[ \Delta y_t = \beta_1 + \Delta \eta_t \]
where \(\Delta \eta_t\) is a stationary AR(1) process.
ARIMA estimation relies on maximum likelihood estimation (MLE). This matters because, as you get new data, the parameters are re-estimated using the old series plus the new observations.
Below is an example where we fit models to \(4\) expanding subsets of simulated ARIMA data.
set.seed(1789)
nsims = 200
phis <- c(0.7, -0.5)
thetas <- c(0.3)
xx <- arima.sim(model=list(ar = phis, ma = thetas, sd = 0.1), n=nsims)
xxs <- lapply(1:4, function(i) head(xx, 50*(i)))
sapply(xxs, length)
[1] 50 100 150 200
# plot in reverse to overlay
layout(1)
plot(xx, type = 'n')
title( # so much code just for a fancy title...
bquote(
paste(
"ARIMA(2,0,1): ",
phi[1] , " = ", .(phis[1]), ", ",
phi[2] , " = ", .(phis[2]), ", ",
theta[1], " = ", .(thetas[1])
)
), cex = 0.8)
invisible(lapply(rev(seq.int(4)), function(i) lines(xxs[[i]], col = i)))
The true model is ARIMA(2,0,1). You see below that the model fit to each expanding series has not only a changing order (p,d,q), but also changing coefficients.
This is largely due to how forecast::auto.arima searches for and determines the best fitting model.
# say we KNOW these data should be stationary, so we set `max.d = 0`
fits <- lapply(xxs, forecast::auto.arima, max.d = 0)
descs <- sapply(fits, forecast:::arima.string)
ordrs <- lapply(fits, forecast::arimaorder)
coefs <- lapply(fits, coef)
descs
[1] "ARIMA(0,0,2) with zero mean"
[2] "ARIMA(2,0,1) with zero mean"
[3] "ARIMA(2,0,2) with non-zero mean"
[4] "ARIMA(2,0,2) with zero mean"
ordrs
[[1]]
p d q
0 0 2
[[2]]
p d q
2 0 1
[[3]]
p d q
2 0 2
[[4]]
p d q
2 0 2
coefs
[[1]]
ma1 ma2
1.0808321 0.3589922
[[2]]
ar1 ar2 ma1
0.6374053 -0.4176403 0.4635501
[[3]]
ar1 ar2 ma1 ma2 intercept
0.81426548 -0.44003112 0.09812646 -0.32767394 0.16187432
[[4]]
ar1 ar2 ma1 ma2
0.7975815 -0.4601103 0.1103089 -0.2928758
# i like to make a nested table then explode out the coefs and orders
tbl <-
tibble::tibble(descs, ordrs, coefs) %>%
dplyr::mutate(
ordrs = purrr::map(ordrs, ~as.data.frame(t(as.matrix(.x)))),
coefs = purrr::map(coefs, ~as.data.frame(t(as.matrix(.x))))
) %>%
tidyr::unnest(coefs) %>%
tidyr::unnest(ordrs)
tbl
# A tibble: 4 x 9
descs p d q ma1 ma2
<chr> <int> <int> <int> <dbl> <dbl>
1 ARIMA(0,0,2) with zero mean 0 0 2 1.08 0.359
2 ARIMA(2,0,1) with zero mean 2 0 1 0.464 NA
3 ARIMA(2,0,2) with non-zero mean 2 0 2 0.0981 -0.328
4 ARIMA(2,0,2) with zero mean 2 0 2 0.110 -0.293
ar1 ar2 intercept
<dbl> <dbl> <dbl>
1 NA NA NA
2 0.637 -0.418 NA
3 0.814 -0.440 0.162
4 0.798 -0.460 NA
But even when we impose the correct model order, ARIMA(2,0,1), the coefficient estimates differ across the expanding series, though they certainly overlap within their standard errors.
# now impose the correct model order, ARIMA(2,0,1)
fits <- lapply(xxs, forecast::Arima, order = c(2,0,1), include.mean = F)
descs <- sapply(fits, forecast:::arima.string)
ordrs <- lapply(fits, forecast::arimaorder)
coefs <- lapply(fits, coef)
tbl2 <-
tibble::tibble(descs, ordrs, coefs) %>%
dplyr::mutate(
ordrs = purrr::map(ordrs, ~as.data.frame(t(as.matrix(.x)))),
coefs = purrr::map(coefs, ~as.data.frame(t(as.matrix(.x))))
) %>%
tidyr::unnest(coefs) %>%
tidyr::unnest(ordrs)
tbl2
# A tibble: 4 x 7
descs p d q ar1 ar2 ma1
<chr> <int> <int> <int> <dbl> <dbl> <dbl>
1 ARIMA(2,0,1) with zero mean 2 0 1 0.469 -0.200 0.559
2 ARIMA(2,0,1) with zero mean 2 0 1 0.637 -0.418 0.464
3 ARIMA(2,0,1) with zero mean 2 0 1 0.488 -0.404 0.472
4 ARIMA(2,0,1) with zero mean 2 0 1 0.517 -0.442 0.424